Abstract

The growing amount of data available in modern-day datasets creates a
need to search and retrieve information efficiently. To make large-scale
search feasible, Distance Estimation and Subset Indexing are the main
approaches. Although binary coding has been popular for implementing both
techniques, n-ary coding (known as Product Quantization) is also very
effective for Distance Estimation. However, their relative performance
has not been studied for Subset Indexing. We investigate whether binary
or n-ary coding works better under different retrieval strategies. This
leads to the design of a new n-ary coding method, "Linear Subspace
Quantization (LSQ)", which, unlike other n-ary encoders, can be used as a
similarity-preserving embedding. Experiments on image retrieval show that
when Distance Estimation is used, n-ary LSQ outperforms other methods.
However, when Subset Indexing is applied, interestingly, binary codings
are more effective and binary LSQ achieves the best accuracy.


1. Introduction 

Large-scale retrieval has attracted growing attention in recent years due
to the need for image search based on visual content and the availability
of large-scale datasets. This paper focuses on the problem of approximate
nearest neighbor (ANN) search for large-scale retrieval.

Approaches for solving this problem generally fall into two
subcategories: Fast Distance Estimation [7, 12] and Fast Subset Indexing
[14, 13, 2, 10, 1]. Fast Distance Estimation methods reduce computation
cost by approximating the distance function, since distance computation
is very expensive in high-dimensional feature spaces. Fast Subset
Indexing methods, on the other hand, reduce the cost by constraining the
search space for a query to a subset of the dataset instead of the whole
dataset.

A general technique for ANN search (both Fast Distance Estimation and
Fast Subset Indexing) is to discretize the feature space into K regions.
Different coding methods can be used for this purpose. One of the classic
methods is one-hot encoding using K-means. K-means is a classic
quantization technique that quantizes data into K regions (clusters).
Data is coded using a K-bit binary code in which only one bit is one
(representing the appropriate cluster). Although K-means works well for
small values of K, it becomes intractable for large K.

Figure 1: Difference between Subspace Clustering (Part A) and Subspace
Reduction (Part B) for generating n-ary codes. The example shows the
simple case of generating 2-dimensional 3-ary codes. In Subspace
Clustering, data is clustered into 3 clusters in the two defined
subspaces (e.g. the code for the red cross is [2, 2]^T). In Subspace
Reduction, data is transformed into a two-dimensional space and each
dimension is discretized into 3 bins (e.g. the code for the red cross is
[2, 3]^T).

An alternative method to one-hot encoding using K-means is binary coding.
One can code the K clusters with $m = \log_2(K)$¹ dimensional binary
codes by relaxing the one-hot encoding constraint and allowing multiple
bits to be one. This is equivalent to partitioning the space into two
regions m times. Many binary coding methods have been designed to address
this problem [2, 5, 11, 16]. While binary coding is more scalable, it has
a high reconstruction error.

¹ In this paper, without loss of generality, we assume that K is selected
such that m is a natural number.

Binary coding can be relaxed by allowing each dimension to be n-ary
instead of binary (i.e. to take on integer values between 1 and n). In
this case, K clusters can be coded with $m = \log_n(K)$ dimensional n-ary
codes. We introduce a new categorization for methods that generate n-ary
codes. We explore two general approaches to generate m-dimensional n-ary
codes: (1) Subspace Clustering: in this approach, the original feature
space is divided into m subspaces and each subspace is quantized into n
clusters. (2) Subspace Reduction: here, the dimensionality of the
original feature space is reduced to m and each dimension is quantized
into n bins. Figure 1 illustrates these approaches. Multi-dimensional
quantization methods (e.g. PQ, CK-means) [4, 12, 7] adopt the first
approach to perform n-ary coding. They solve the problem for any m and n
(including n = 2, which leads to binary coding based on the first
approach). On the other hand, many binary coding methods (e.g. ITQ, LSH)
[5, 16, 2, 11] are instances of the second approach. However, they are
limited to the case where n = 2.

Most recent papers on quantization [4, 12] compared their proposed
methods with binary coding methods only with respect to Distance
Estimation (i.e. typically employing exhaustive search over the data,
where the approximated distance mimics the ordering of images based on
Euclidean distance in the original feature space). This leaves unanswered
the question of whether binary or n-ary coding performs better for ANN
search using Subset Indexing.

The contributions of this paper are twofold. First, a new general
approach for multi-dimensional n-ary coding is introduced. Based on that,
Linear Subspace Quantization (LSQ) is proposed as a new multi-dimensional
n-ary encoder. Unlike previously proposed n-ary encoders, in which the
Euclidean distance between n-ary codes is not preserved, the distances in
the LSQ coded space correlate with the Euclidean distance in the original
space. As a result, the codes can be used directly for learning tasks.
Furthermore, LSQ does not make the restrictive assumption of dividing
space into independent subspaces, which is common in n-ary encoders.
Experiments show that LSQ outperforms such encoders. Second, it is shown
that n-ary coding does not always outperform binary coding in retrieval.
We show that binary coding works better when Subset Indexing is used and
present an explanation based on the two approaches to coding. To the best
of our knowledge, this has not been identified previously, although it is
very important for large-scale retrieval.

The rest of the paper is organized as follows. In Section 2, the general
formulation for both Subspace Clustering and Subspace Reduction is
presented. Additionally, the LSQ coding method is described and its
relation to other methods and its properties are discussed. Section 3
describes the ways n-ary and binary coding methods can be exploited in
combination with distance estimation and subset indexing to reduce search
cost in retrieval. Experiments are reported in Section 4. Finally,
Section 5 concludes the paper.

2. n-ary Coding 

ANN search methods discretize the feature space into a set of disjoint
regions; n-ary coding can be used for this purpose. An n-ary code of
length m is defined by an m-dimensional vector in $\{1, 2, \dots, n\}^m$.
The goal is to transform data into m-dimensional n-ary codes that
reconstruct the original data accurately.

First, consider constructing a one-dimensional n-ary code. A common
objective for quantization methods is to minimize the reconstruction
error, referred to as quantization error. In other words, given a set of
data points $X \in \mathbb{R}^{D \times N}$ where each column is a data
point $x \in \mathbb{R}^{D}$, the quantization objective can be expressed
as:

$$\min_{Q} \; \|X - Q(X)\|_F^2 \qquad (1)$$

where Q maps a vector x (a column of X) into one element of a finite set
of vectors $C = \{c_1, c_2, \dots, c_K\}$ in $\mathbb{R}^{D}$, referred
to as a codebook. The index of the codebook vector assigned to a data
point is its one-dimensional n-ary code. K-means optimizes this objective
when the size of the codebook is equal to K.

Using the one-hot encoding notation, the optimization in (1) can be
written as follows:

$$\min_{C, B} \; \|X - CB\|_F^2 \qquad (2)$$

where $B \in \{0, 1\}^{K \times N}$ is a binary matrix in which each
column is a 1-way selector: all of its elements but one are zero.
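The one-hot form of (2) can be illustrated with a small NumPy sketch. It assumes a codebook that is already given (K-means training is not shown), and the function name `one_hot_assign` and the toy data are hypothetical, chosen only for illustration.

```python
import numpy as np

def one_hot_assign(X, C):
    """Assign each column of X (D x N) to its nearest codeword (column of C,
    D x K) and return the K x N one-hot selector B plus the index vector."""
    # Squared Euclidean distance between every codeword and every point.
    d2 = ((C[:, :, None] - X[:, None, :]) ** 2).sum(axis=0)   # K x N
    idx = d2.argmin(axis=0)                                   # nearest codeword
    B = np.zeros((C.shape[1], X.shape[1]), dtype=int)
    B[idx, np.arange(X.shape[1])] = 1                         # one 1 per column
    return B, idx

# Toy data (assumed): D = 2, N = 4 points, K = 2 codewords.
X = np.array([[0.0, 0.1, 1.0, 0.9],
              [0.0, 0.2, 1.0, 1.1]])
C = np.array([[0.0, 1.0],
              [0.0, 1.0]])
B, idx = one_hot_assign(X, C)
error = np.linalg.norm(X - C @ B, "fro") ** 2   # objective value of (2)
```

Indices here are 0-based, whereas the text counts codes from 1; the product `C @ B` simply copies the selected codeword into each column.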

In order to generalize one-dimensional n-ary codes to m-dimensional
codes, we explore two approaches: Subspace Clustering and Subspace
Reduction. Although the former has been explored in the literature, the
latter has not. Without loss of generality, we assume that the data are
mean centered (i.e. $\mathrm{mean}(X) = 0_{D \times 1}$) and scaled to
$[-1, 1]$ by mapping the data to the unit hyper-sphere. In [15, 5], it is
shown that it is very beneficial to normalize the data to the unit
hyper-sphere.

2.1. Subspace Clustering 

Here, to generate m-dimensional n-ary codes, the original feature space
is divided into m subspaces and each subspace is discretized into n
regions. To this end, in (2), the number of clusters can be set to
$K = mn$ and the selector can be allowed to include m non-zero elements
as follows:

$$\min_{B, C} \; \left\| X - [\, C_1 \mid C_2 \mid \dots \mid C_m \,]
\begin{bmatrix} B_1 \\ B_2 \\ \vdots \\ B_m \end{bmatrix} \right\|_F^2
\qquad (3)$$

Here $C_i$ and $B_i$ are the codebook and its related one-hot encoding in
the i'th subspace. In general, the optimization of (3) is intractable. As
a result, Product Quantization [7] and Cartesian K-means [12] solve a
constrained version of this problem where the subspaces created by the
$C_i$'s are orthogonal. In other words,
$\forall i \neq j: \; C_i^T C_j = 0_{n \times n}$. Intuitively, the
original space is divided into m independent subspaces and each is
clustered into n regions. We next present another approach to n-ary
coding in which no such constraint is imposed.
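A minimal sketch of Subspace Clustering encoding follows. It assumes the subspaces are contiguous blocks of coordinates (which makes them orthogonal) and that the per-subspace codebooks are already available; learning them (e.g. by running K-means in each block) is omitted. The names `pq_encode` and `pq_decode` are hypothetical.

```python
import numpy as np

def pq_encode(X, codebooks):
    """Encode the columns of X (D x N) with m sub-codebooks.
    codebooks[i] has shape (D/m, n): n centers for the i-th coordinate block.
    Returns an m x N array of cluster indices in {0, ..., n-1}."""
    m = len(codebooks)
    blocks = np.split(X, m, axis=0)            # m contiguous subspaces
    codes = np.empty((m, X.shape[1]), dtype=int)
    for i, (Xi, Ci) in enumerate(zip(blocks, codebooks)):
        d2 = ((Ci[:, :, None] - Xi[:, None, :]) ** 2).sum(axis=0)  # n x N
        codes[i] = d2.argmin(axis=0)           # nearest center in subspace i
    return codes

def pq_decode(codes, codebooks):
    """Reconstruct points by stacking the selected center of every subspace."""
    return np.vstack([Ci[:, c] for Ci, c in zip(codebooks, codes)])

# Toy setting (assumed): D = 4, m = 2 subspaces, n = 3 centers per subspace.
C = np.array([[-1.0, 0.0, 1.0],
              [-1.0, 0.0, 1.0]])
codebooks = [C, C]
X = np.array([[0.9, 0.1],
              [1.0, -0.1],
              [-0.8, 0.9],
              [-1.1, 1.2]])
codes = pq_encode(X, codebooks)
X_hat = pq_decode(codes, codebooks)
```

Each column of `codes` is a 2-dimensional 3-ary code (0-based here); the cluster indices carry no ordering, which is the property discussed later for embeddings.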


2.2. Subspace Reduction

Subspace Reduction maps the data into an m-dimensional space and
discretizes each dimension into n bins. The goal is to perform this
discretization so as to minimize the reconstruction error in the original
space. Formally, the optimization problem can be written as follows:

$$\min_{f, \bar{f}} \; \|X - \bar{f}(q_n(f(X)))\|_F^2 + \lambda R(\bar{f})
\qquad (4)$$

where $f : \mathbb{R}^{D} \to \mathbb{R}^{m}$ is the mapping function and
is applied to each column of X, and
$\bar{f} : \mathbb{R}^{m} \to \mathbb{R}^{D}$ ($m < D$) is the
reconstruction function which projects the data back to the D-dimensional
space in which the reconstruction error is computed. In order to prevent
overfitting, the reconstruction function must be regularized. R is a
regularizing function that limits the variations in $\bar{f}$, $\lambda$
is a parameter controlling the amount of regularization, and $q_n$ is a
uniform quantizer that is applied to each element of its input and is
defined as:

$$q_n(x) = \begin{cases}
\theta_n(1) & x < \frac{\theta_n(1) + \theta_n(2)}{2} \\
\theta_n(2) & \frac{\theta_n(1) + \theta_n(2)}{2} \le x < \frac{\theta_n(2) + \theta_n(3)}{2} \\
\vdots & \\
\theta_n(n) & \frac{\theta_n(n-1) + \theta_n(n)}{2} \le x
\end{cases} \qquad (5)$$

where $\theta_n(i) = -1 + \frac{2(i-1)}{n-1}$ generates n uniformly
distributed values in $[-1, 1]$. In other words, $q_n(x)$ is a general
quantizer that maps any real value in $[-1, 1]$ into one of n uniformly
distributed values in $[-1, 1]$. For example, $q_2(x)$ is the sign
function ($\forall x > 0: \mathrm{sign}(x) = 1$,
$\forall x < 0: \mathrm{sign}(x) = -1$) and $q_3(x)$ maps x into one of
the three values $\{-1, 0, 1\}$.

To summarize, optimizing (4) identifies a mapping and a reconstruction
function such that the quantized data in the space generated by the
mapping function can be reconstructed accurately by the reconstruction
function. It should be noted that an m-dimensional n-ary code is
generated by $q_n(f(X))$.

2.2.1 Linear Subspace Quantization (LSQ)

LSQ is a multidimensional n-ary coding method based on Subspace Reduction
where linear functions are used as the mapping and the reconstruction
functions in (4). Assume that $f(x) = W^T x$ and $\bar{f}(x) = V^T x$
where $W \in \mathbb{R}^{D \times m}$ and $V \in \mathbb{R}^{m \times D}$.
Employing the Frobenius norm as the regularizing function, the
optimization problem in (4) becomes:

$$\min_{W, V} \; \|X - V^T q_n(W^T X)\|_F^2 + \lambda \|V\|_F^2 \qquad (6)$$

To solve this problem, we propose a two-step iterative algorithm.
Subsequently, the convergence of the proposed algorithm will be proven.
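As a worked instance of the uniform quantizer $q_n$ in (5), consider n = 4. The level formula used below, $\theta_n(i) = -1 + 2(i-1)/(n-1)$, is an assumption: it is the evenly spaced choice on $[-1, 1]$ that agrees with the stated examples $q_2$ (the sign function) and $q_3$ (values $-1, 0, 1$).

```latex
\theta_4(1) = -1, \quad \theta_4(2) = -\tfrac{1}{3}, \quad
\theta_4(3) = \tfrac{1}{3}, \quad \theta_4(4) = 1
% decision thresholds are the midpoints of neighbouring levels
\tfrac{\theta_4(1)+\theta_4(2)}{2} = -\tfrac{2}{3}, \quad
\tfrac{\theta_4(2)+\theta_4(3)}{2} = 0, \quad
\tfrac{\theta_4(3)+\theta_4(4)}{2} = \tfrac{2}{3}
% three sample inputs
q_4(-0.9) = -1, \qquad q_4(0.5) = \tfrac{1}{3}, \qquad q_4(0.7) = 1
```

Each input is mapped to the nearest of the four levels, so a 4-ary code dimension needs 2 bits.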


• Learning LSQ:

The optimization for W and V in (6) can be solved by a two-step iterative
optimization algorithm (i.e. fixing one variable and updating the other).
The steps are as follows:

1. Fix W and update V: For a fixed W, define $H = q_n(W^T X)$; then we
have a closed form solution for V as

$$V = (H H^T + \lambda I)^{-1} H X^T \qquad (7)$$

2. Fix V and update W: In this step, W is updated as:

$$W = V^{\dagger} \qquad (8)$$

where $V^{\dagger}$ is the Moore-Penrose pseudoinverse of V. In the
following we prove that the pseudoinverse is an optimal solution for (6)
when V is fixed.

The algorithm iterates between steps 1 and 2 until there is no progress
in minimizing (6).
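The two steps above can be sketched in NumPy as follows. This is an illustrative sketch rather than the authors' implementation: the random initialization of W, the fixed iteration count, the clipping of values outside $[-1, 1]$, and the names `quantize`, `lsq_objective` and `fit_lsq` are all assumptions.

```python
import numpy as np

def quantize(A, n):
    """Uniform quantizer q_n: map each entry to the nearest of n levels evenly
    spaced in [-1, 1]; entries outside the range are clipped (assumption)."""
    levels = np.linspace(-1.0, 1.0, n)
    step = 2.0 / (n - 1)
    idx = np.floor((A + 1.0) / step + 0.5)          # round to nearest level
    return levels[np.clip(idx, 0, n - 1).astype(int)]

def lsq_objective(X, W, V, n, lam):
    """Value of (6): reconstruction error plus Frobenius regularizer on V."""
    R = X - V.T @ quantize(W.T @ X, n)
    return float(np.sum(R ** 2) + lam * np.sum(V ** 2))

def fit_lsq(X, m, n, lam=1e-3, iters=20, seed=0):
    D, _ = X.shape
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((D, m))                 # assumed initialization
    history = []
    for _ in range(iters):
        H = quantize(W.T @ X, n)                                  # m x N codes
        V = np.linalg.solve(H @ H.T + lam * np.eye(m), H @ X.T)   # step 1, (7)
        history.append(lsq_objective(X, W, V, n, lam))
        W = np.linalg.pinv(V)                                     # step 2, (8)
    return W, V, history

# Synthetic data (assumed): mean-centered, columns scaled to unit norm.
rng = np.random.default_rng(1)
X = rng.standard_normal((8, 200))
X -= X.mean(axis=1, keepdims=True)
X /= np.linalg.norm(X, axis=0, keepdims=True)
W, V, history = fit_lsq(X, m=4, n=3)
codes = quantize(W.T @ X, 3)        # 4-dimensional 3-ary codes, one per column
```

The `history` list records the objective after each V update, which is a convenient way to monitor the stopping criterion ("no progress in minimizing (6)") in practice.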

• Convergence of LSQ:

In order to prove the convergence of the algorithm, we show that both
steps reduce the objective value. The optimality of the first step can be
easily shown by simple linear algebra. Here, we focus on proving that the
second step reduces the objective value.

Given that
$\arg\min_{Y} \|B - AY\| = \arg\min_{Y} \|A^{\dagger} B - Y\|$, the
solution of the optimization in (6) for fixed V is equivalent to the
solution of the following problem:

$$\min_{W} \; \|V^{\dagger T} X - q_n(W^T X)\|_F^2 \qquad (9)$$

Defining $M = V^{\dagger T} X$, the optimal solution for W can be
formulated as:

$$W^* = \arg\min_{W} \; \|M - q_n(W^T X)\|_F^2 \qquad (10)$$

It should be noted that the optimal solution is not unique. Therefore,
$\mathcal{W}$ is defined as the optimal solution set for $W^*$. The goal
is to prove that $V^{\dagger} \in \mathcal{W}$.

Let $A^* = W^{*T} X$. We first prove that $q_n(A^*) = q_n(M)$. Suppose,
to the contrary, that $q_n(A^*) \neq q_n(M)$. Consequently, there should
be at least one i and j such that
$q_n(A^*(i,j)) \neq q_n(M(i,j))$. Since $q_n(A^*)$ is defined in the
optimal solution of the optimization (10), its corresponding objective
value should be less than that of any other feasible point. This leads to
the conclusion that
$(M(i,j) - q_n(A^*(i,j)))^2 < (M(i,j) - q_n(M(i,j)))^2$ (note that even
if more than one element differs between $q_n(A^*)$ and $q_n(M)$, the
inequality holds for at least one of them). However, this contradicts the
definition of $q_n(x)$ in (5), since $q_n(x)$ should then map $M(i,j)$
into $q_n(A^*(i,j))$ (it should be noted that $q_n(A^*(i,j))$ is in the
range of $q_n(M(i,j))$). So for any i and j,
$q_n(A^*(i,j)) = q_n(M(i,j))$. Therefore, $q_n(M) = q_n(A^*)$.
Considering the definition of M and $A^*$ completes the proof that
$V^{\dagger} \in \mathcal{W}$.

Finally, since both steps in our optimization reduce the objective value,
LSQ converges to a local optimum of the optimization (6).

• Relation to ITQ:

ITQ is a special case of LSQ when n = 2 and W, V are rotation matrices
where $W = V^T$. Our experiments show that the binary codes generated by
LSQ lead to higher accuracies than the binary codes generated by ITQ.

• Geometrical Interpretation:

LSQ finds a linear transformation of a quantized hyper-cube that best
fits the data. Figure 2 illustrates a simple 2D example in which the
quantizer $q_n$ is fit to the data by a rotation.

Figure 2: LSQ fitting 2D data using a quantizer. Data points are shown by
orange crosses. The blue circles indicate the quantization levels in
different dimensions. LSQ reduces the reconstruction error by performing
a linear transformation over the quantized hyper-cube.

2.2.2 LSQ as an n-ary Embedding

While binary encoding techniques try to minimize the reconstruction
error, the resulting codes preserve similarities between samples. In
other words, the Hamming distance in the binary space approximates the
Euclidean distance in the original feature space. As a consequence of
this property, these binary codes can be exploited as feature vectors for
learning tasks in the embedded space. Many recent approaches based on
this property have been proposed to make learning more efficient
[16, 17].

In subspace clustering methods (e.g. CK-means), the cluster indices
generated by the quantizer cannot be viewed as a similarity-preserving
embedding. This is due to the fact that there are no constraints on
assigning these indices to clusters. In subspace reduction methods (e.g.
LSQ), each dimension of an n-ary code has a finite (discrete) set of real
values as its domain. For each dimension, the distances between these
discrete values correlate with the distances between the data points in
the original feature space in the direction of that dimension. Therefore,
the Euclidean distance in the quantized data correlates with the
Euclidean distance in the original feature space.

One could post-process CK-means to generate similarity (distance)
preserving codes by assigning the appropriate indices to cluster centers
after completion of the training stage. These indices can be obtained by
finding a 1D subspace for each of the subspaces generated by CK-means. A
simple model could compute PCA over the cluster centers in each subspace
to reduce the cluster centers into 1D real values. However, in Section
4.5, a classification experiment is performed in which the n-ary codes
are used as features. The result shows that, as an embedding, LSQ
outperforms CK-means by a large margin even after refining the CK-means
index assignments to clusters.

3. K-NN Retrieval using Data Encoding

A large source of computational cost in nearest neighbor search is the
distance computation between the query and all the samples in the
dataset. In order to speed up K-NN search, one can speed up the
computation of the distance function and/or reduce the number of distance
computations by limiting the search space for a given query. We refer to
the first strategy as Distance Estimation and the second as Subset
Indexing. In the following subsections, we show how Subspace Clustering
and Subspace Reduction coding techniques can be used for each of these
strategies.

3.1. Retrieval by Distance Estimation

Data coding can reduce the cost of distance computation since the
Euclidean distance can be efficiently estimated in the coding space.





• Distance estimation using Subspace Clustering n-ary codes:

Once data is coded, the Euclidean distance between two points can be
estimated as the sum of distances between the cluster centers assigned to
those data points in each subspace [7]. This is known as the symmetric
distance. The distances between the n cluster centers in each subspace
can be pre-computed in an n x n table. Then, computing the symmetric
distance can be implemented efficiently by m look-ups and additions of
table elements, one for each subspace. More formally,

$$d(x, y) = \sum_{i=1}^{m} L_i\big(c(u_i(x)),\, c(u_i(y))\big) \qquad (11)$$

where $u_i(x)$ projects x into the i'th subspace, $c(u)$ is the cluster
index to which u belongs, and $L_i$ is the pre-computed distance table
for the i'th subspace. If we consider x as the query and y as a data
point from the database, $c(u_i(y))$ can be pre-computed. Therefore the
complexity is $O(mN)$ for each query, where N is the total number of
points in the database.
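The table look-up in (11) can be sketched as below. The sketch stores squared distances in the tables so that the per-subspace terms add up to the squared Euclidean distance between the two reconstructions; that choice, the toy codebooks, and the names `build_tables` and `symmetric_distance` are assumptions made for illustration.

```python
import numpy as np

def build_tables(codebooks):
    """For each subspace, pre-compute the n x n table L_i of squared distances
    between its cluster centers (columns of the sub-codebook)."""
    tables = []
    for C in codebooks:                         # C has shape (d_i, n)
        diff = C[:, :, None] - C[:, None, :]    # d_i x n x n
        tables.append((diff ** 2).sum(axis=0))
    return tables

def symmetric_distance(code_x, code_y, tables):
    """m look-ups and additions: one table entry per subspace."""
    return float(sum(L[a, b] for L, a, b in zip(tables, code_x, code_y)))

# Toy codebooks (assumed): m = 2 subspaces with n = 3 centers each.
C = np.array([[-1.0, 0.0, 1.0],
              [-1.0, 0.0, 1.0]])
tables = build_tables([C, C])
d = symmetric_distance([2, 0], [1, 2], tables)

# Cross-check against the distance between the reconstructed points.
x_hat = np.concatenate([C[:, 2], C[:, 0]])
y_hat = np.concatenate([C[:, 1], C[:, 2]])
d_direct = float(np.sum((x_hat - y_hat) ** 2))
```

Only the integer codes and the small tables are touched at query time, which is where the $O(mN)$ per-query cost comes from.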

• Distance estimation using Subspace Reduction n-ary codes:

As mentioned earlier, the Euclidean distance between data quantized by
subspace reduction approximates the Euclidean distance in the original
feature space. Therefore we need only compute the distance between coded
values. This has complexity $O(mN)$, which is the same as the complexity
of subspace clustering.

• Distance estimation using binary codes:

For the binary codes, Hamming distance is used as the distance metric.
Computing Hamming distances using m-bit binary codes has complexity
$O(mN)$ for each query.
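As a small illustration of exhaustive ranking by Hamming distance, the sketch below stores each binary code as a Python integer and counts differing bits with XOR. The function names and the toy database are hypothetical.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary codes stored as integers."""
    return bin(a ^ b).count("1")

def rank_by_hamming(query: int, database: list) -> list:
    """Database indices sorted by Hamming distance to the query code
    (ties keep their original order because sorted() is stable)."""
    return sorted(range(len(database)), key=lambda i: hamming(query, database[i]))

# Toy 6-bit codes (assumed).
database = [0b110000, 0b110001, 0b001111, 0b100000]
order = rank_by_hamming(0b110000, database)
```

Every database code is visited once per query, so the scan is linear in N as stated above.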

3.2. Retrieval by Subset Indexing 

Another way to speed up nearest neighbor search is to limit the search
space. Hashing techniques [2] and tree-based methods [14] limit the
search space by constraining search to a subset of samples in the
database. This is accomplished by indexing the data into a data structure
(e.g. hash tables or a search tree) at training time. Multiple index
hashing [6, 13] is one such data structure that can be used for binary
and n-ary codes.

• Multiple index hashing using n-ary codes:

In this approach, for m-dimensional n-ary coding, we create an index
table $T_i$, $i = 1, \dots, m$, for each dimension. Each table has n
tuples $(Idx_j, C_j)$, $j = 1, \dots, n$, where $Idx_j$ corresponds to
one of the n values in a dimension of the code and $C_j$ is a list of
those data points' indices such that the value of the dimension in their
code is $Idx_j$. At query time, for each dimension of the code, a set of
data indices is retrieved. Figure 3(a) illustrates this technique. For
each index in the union of these sets, we assign a score s between 1 and
m which indicates that a particular index has been retrieved from s
dimensions. The samples with higher scores are more likely to be similar
to the query sample. By sorting the indices based on their score, we can
choose the top-K samples as the K-NNs. If the total number of retrieved
indices is less than K, we change the value in one of the dimensions of
the query code that has minimum distance to the quantized query point in
the original space. Then we retrieve a new set and repeat the process
until the total size of the retrieval set is greater than or equal to K.

Figure 3: Retrieval using Multiple Index Hashing. (a) n-ary Coding: each
dimension of the query is a key index to its corresponding table. Each
table has n rows, each of which points to the set of samples in the base
set with the same value in that dimension. (b) Binary Coding: in this
case, binary codes are partitioned into sets of b consecutive bits (here
b = 3) and each set is used as a key index for the corresponding table.
Tables have $2^b$ rows containing samples with the same value in the
corresponding bits.
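The table construction and scoring described above can be sketched with standard-library containers. This is a simplified illustration: the fallback that perturbs a query dimension when fewer than K items are retrieved is omitted, ties are broken by database index (an assumption), and the names `build_index` and `query_index` are hypothetical.

```python
from collections import Counter, defaultdict

def build_index(codes):
    """codes: list of m-dimensional n-ary codes (tuples of ints).
    tables[i][v] lists the indices of points whose i-th code entry equals v."""
    m = len(codes[0])
    tables = [defaultdict(list) for _ in range(m)]
    for idx, code in enumerate(codes):
        for i, v in enumerate(code):
            tables[i][v].append(idx)
    return tables

def query_index(tables, q_code, k):
    """Score every retrieved index by the number of dimensions in which it
    matches the query code, then return the k highest-scoring indices."""
    scores = Counter()
    for table, v in zip(tables, q_code):
        scores.update(table.get(v, []))        # one vote per matching dimension
    ranked = sorted(scores, key=lambda idx: (-scores[idx], idx))
    return ranked[:k], scores

# Toy database of 3-dimensional 3-ary codes (assumed).
codes = [(1, 2, 3), (1, 2, 1), (3, 2, 3), (2, 1, 1)]
tables = build_index(codes)
top, scores = query_index(tables, (1, 2, 3), k=2)
```

Points that share no code entry with the query (index 3 here) are never touched, which is how subset indexing avoids scanning the whole database.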

• Multiple index hashing with binary codes:

Similar to n-ary codes, binary codes can be used for multiple index
hashing. However, in this case each set of b consecutive bits is grouped
together to create the indices for accessing the tables. Considering
$b = \log_2(n)$, this partitioned binary code can be seen as an n-ary
code. As a result, the same technique can be applied for multiple index
hashing as discussed previously. This case can be seen in Figure 3(b).
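Grouping b consecutive bits into table keys can be sketched as follows; the string representation of the code and the function name `split_bits` are assumptions made for readability.

```python
def split_bits(code: str, b: int) -> tuple:
    """Split a binary code such as '110000' into consecutive b-bit groups and
    read each group as an integer key in {0, ..., 2**b - 1}."""
    if len(code) % b:
        raise ValueError("code length must be a multiple of b")
    return tuple(int(code[i:i + b], 2) for i in range(0, len(code), b))

# A 6-bit code with b = 3 yields two keys, i.e. a 2-dimensional 8-ary code.
keys = split_bits("110000", 3)
```

The resulting keys can then be indexed with one table per group, in the same way as the dimensions of an n-ary code.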

3.2.1 Subset Indexing: Binary or n-ary Coding? 

As mentioned earlier, n-ary coding does not always outperform binary
coding for large-scale retrieval. More precisely, when Subset Indexing is
used to reduce the search cost, binary coding achieves better search
accuracy. This is due to the fact that quantization does not necessarily
preserve the similarities (or distances) between data. In other words, a
good quantizer Q that minimizes the quantization error in (1) does not
always preserve relative distances between data. Formally:

$$\|x_1 - x_2\| < \|x_1 - x_3\| \;\not\Rightarrow\;
\|Q(x_1) - Q(x_2)\| < \|Q(x_1) - Q(x_3)\|$$

This is important when retrieval is carried out by subset indexing.
There, binary codes may retrieve the nearest neighbors better than n-ary
codes. Figure 4 illustrates an example of 2-dimensional 8-ary codes and
their corresponding binary codes, which have 6 bits
($6 = 2\log_2(8)$). Each bit is generated by a line, based on which side
of the line the point lies. The green dots are the points in the database
and the red diamond is a query point. In this figure the binary code for
the query sample is 110000. In the subspace clustering view, we cluster
each dimension into 8 clusters. In this case, all points in the yellow
region will be retrieved by multiple index hashing. As can be seen, none
of the actual nearest neighbors can be retrieved. But when we use the
binary codes for multiple index hashing, all the actual nearest neighbors
are retrieved. This is the blue region (i.e. the union of the regions
created by the first three bits (110) and the second three bits (000) of
the query code). Our experimental evaluation confirms that when subset
indexing is used for retrieval, binary codes outperform n-ary codes.
Although n-ary codes are more accurate for quantization, they are less
accurate for ANN with subset indexing.

Figure 4: Binary vs. n-ary for ANN

4. Experiments 

We report experiments on three well-known datasets, namely GIST1M [7],
CIFAR10 [9], and a subset of ImageNet [3] which is used for the
ILSVRC2012 challenge. GIST1M contains 1M base feature vectors, 500K
training samples and 1K query samples. For CIFAR10, we randomly selected
20K samples as our training set, 500 samples as query images and the
remaining 39500 images as the base samples. Raw pixel values are used as
features for this dataset. The ImageNet ILSVRC2012 dataset consists of
500K training samples, 250K base samples, and 1K query images. We used
ConvNet as the state-of-the-art feature extractor for this dataset. The
ConvNet features are extracted by Caffe [8].

Following [12], we used recall as the performance measure for retrieval.
The training set is used to train the coding model and the learned model
is applied for coding the base and query set. For each point in the query
set, we find its R nearest neighbors and report the recall at R. By
varying R we draw the recall curves.
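A recall@R curve can be computed along the following lines. The text does not spell out the exact definition, so this sketch assumes the common convention of counting the fraction of queries whose exact nearest neighbor appears among the first R retrieved items; the function name and the toy rankings are hypothetical.

```python
def recall_at_r(retrieved, true_nn, r):
    """retrieved[q]: ranked list of database indices returned for query q.
    true_nn[q]: index of the exact nearest neighbor of query q.
    Returns the fraction of queries whose true neighbor is in the top r."""
    hits = sum(1 for ranked, nn in zip(retrieved, true_nn) if nn in ranked[:r])
    return hits / len(true_nn)

# Toy rankings for three queries (assumed).
retrieved = [[4, 2, 9], [1, 7, 3], [5, 0, 8]]
true_nn = [2, 3, 6]
curve = [recall_at_r(retrieved, true_nn, r) for r in (1, 2, 3)]
```

Evaluating the function over a range of R values gives one point per R, which traces out the curve.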

As mentioned earlier, retrieval can be made faster using two approaches:
distance estimation and subset indexing. The performance of different
methods can vary with respect to which approach is used. Therefore, each
method is examined with respect to both and an analysis is presented. The
nearest neighbors in the original feature space are defined as ground
truth for each query image. To make the comparison fair, in each
experiment the number of bits that can be used by each coding method is
limited to the same fixed budget; e.g. a 2-dimensional 8-ary code
requires 6 bits of memory (3 bits per code dimension).

4.1. Retrieval using Distance Estimation 

Figure 5 shows the Recall@R curves on different datasets using a budget
of 256 bits. In this figure, LSQ(N) and LSQ(B) refer to the n-ary and
binary versions of the LSQ method respectively. The Recall@R curves are
shown for different numbers of bits per code dimension, which controls
the number of quantization steps for n-ary encoders (e.g. for LSQ(N)-5 or
LSQ(B)-5 the quantizer has 32 levels or 5 bits). As can be seen, the
performance of n-ary codes is better than that of binary codes. Also, LSQ
outperforms CK-means on all three datasets.

Figure 6 explores the effect of the number of bits on the different
methods. We fixed the number of bits per code dimension to 5 (e.g. the
CK-means algorithm would learn 32 clusters per segment) and report the
area under the Recall@R curve. Again, LSQ performs better than CK-means.

4.2. Retrieval using Subset Indexing 

As discussed in Section 3.2, either binary or n-ary coding can be used to
speed up search with Subset Indexing. This approach limits the search to
a small number of samples by indexing subsets of the database. Here, the
performance of binary and n-ary coding is compared. We compare the
retrieval results of the best n-ary encoding for this task (CK-means) to
the best binary coding (the binary version of LSQ).

Figure 7 shows the Recall@R curves for this experiment with varying
numbers of bits per code dimension for a fixed budget of 256 bits. In
N-ary-k, k bits are used for quantizing each dimension and additionally
for indexing in the multi-index hashing method (i.e. $2^k$ quantization
steps for each dimension). Similarly, in Binary-k, k consecutive bits are
used for indexing in the hashing method. The effect of changing the
budget of the encoder on the retrieval task can be seen in Figure 8.
These figures illustrate that the binary encoding techniques outperform
the n-ary encoders when the subset indexing technique is used, as
discussed in Section 3.2.

Figure 5: The Recall@R curves for retrieval using distance estimation for
256 bits. Each diagram shows the curve for different methods and
different numbers of bits per code dimension. (a) Results on CIFAR10
dataset, (b) Results on GIST1M dataset, (c) Results on ImageNet dataset.

Figure 6: The Area Under Recall@R curves for retrieval using 5 bits per
code dimension (i.e. 32 quantization steps). Each diagram shows the curve
for different methods and different total bit budgets. (a) Results on
CIFAR10 dataset, (b) Results on ImageNet dataset.

Discussion: These experiments confirm that when retrieval is performed by
distance estimation, it is better to use n-ary coding with n > 2 based on
subspace reduction (e.g. LSQ). On the other hand, when subset indexing is
used for retrieval, binary coding outperforms n-ary coding.

4.3. Comparison of binary coding methods 

Both CK-means and LSQ can be viewed as generalizations of binary encoding
where the number of quantization steps can be more than two. Here, the
number of quantization steps is set to two and the binary versions of LSQ
and CK-means (namely LSQ(B) and OK-means respectively) are compared with
ITQ using subset indexing. Figure 9 shows the area under the
precision-recall curve for these three binary coding methods under
varying bit budgets. As can be seen, the binary version of LSQ
outperforms ITQ and the binary version of CK-means.

Figure 8: The Area Under Recall@R curves for retrieval using 5 bits per
code dimension. Each diagram shows the curve for the best n-ary and
binary coding method for this task and different bit budgets. (a) Results
on CIFAR10 dataset, (b) Results on GIST1M dataset, (c) Results on
ImageNet dataset.

Figure 9: The Area Under precision-recall curves for binary methods. Each
diagram shows the curve for different methods and different bit budgets.
(a) Results on GIST1M dataset, (b) Results on ImageNet dataset.



































[Plot panels: GIST1M, CIFAR10, ImageNet; Recall@R for 256 bits]

Figure 7: The Recall@R curves for retrieval using subset indexing for 256
bits. Each diagram shows the curve for the best n-ary and binary coding
method in this task (CK-means and Binary LSQ respectively). (a) Results
on CIFAR10 dataset, (b) Results on GIST1M dataset, (c) Results on
ImageNet dataset.

Figure 10: The convergence of different coding algorithms. (a) LSQ(B),
(b) ITQ, (c) OK-means. Note that different scales of reconstruction error
are required.


4.4. Convergence of the algorithms 

In Figure 10, the convergence of different binary coding methods is
shown. For this experiment, GIST1M is used. As can be seen, LSQ converges
much faster than OK-means. Also note that the final reconstruction error
of LSQ is much smaller than that of ITQ and OK-means, reflecting the fact
that LSQ reconstructs the data more accurately using the same memory
budget.

4.5. n-ary Codes as Feature Vectors 

The codes constructed by LSQ can be used as feature vectors to perform
learning tasks. Figure 11 shows the performance evaluation of a
classification task using different codings as features. As proposed in
Section 2.2.2, for CK-means, we refine the index assignments to clusters
by mapping the cluster centers in each subspace into a one-dimensional
space using PCA and convert each dimension of the code to the
corresponding value in this 1D space. It can be seen that our proposed
quantization method outperforms CK-means even after refining the CK-means
index assignments.
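The PCA-based refinement of cluster indices can be sketched as below. It assumes the per-subspace cluster centers are available as the columns of small matrices and uses an SVD of the centered centers to obtain the first principal direction; the function names and toy codebooks are hypothetical, and the sign of each 1D axis is arbitrary.

```python
import numpy as np

def centers_to_1d(C):
    """Project the n cluster centers (columns of C, shape d x n) onto their
    first principal direction, giving one real value per cluster index."""
    centered = C - C.mean(axis=1, keepdims=True)
    U, _, _ = np.linalg.svd(centered, full_matrices=False)
    return U[:, 0] @ centered                   # shape (n,)

def refine_codes(codes, codebooks):
    """Replace every cluster index in an m x N code matrix by the 1D value of
    its center, producing real-valued features of the same shape."""
    return np.vstack([centers_to_1d(C)[c] for C, c in zip(codebooks, codes)])

# Toy codebooks (assumed): m = 2 subspaces, n = 3 centers lying on a line.
C = np.array([[-1.0, 0.0, 1.0],
              [-1.0, 0.0, 1.0]])
codes = np.array([[2, 1, 0],
                  [0, 0, 2]])
features = refine_codes(codes, [C, C])
```

After this mapping, neighboring clusters along the principal direction receive nearby values, so the refined codes can be fed to a classifier as ordinary real-valued features.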


Figure 11: Classification accuracy using the codes of different methods
as feature vectors (panel title: "CIFAR10: Accuracy").


5. Conclusion 


We focused on the problem of large-scale retrieval using ANN. A new
general approach for multi-dimensional n-ary coding, Linear Subspace
Quantization (LSQ), was introduced for ANN. LSQ achieves lower
reconstruction error than other n-ary coding methods. Furthermore, it
preserves the similarities in the original space, which is important when
it is used directly for learning tasks. Experiments show that LSQ
outperforms other binary and n-ary coding methods on large-scale image
retrieval. We also compared the performance of binary and n-ary coding
methods for this task. We showed that n-ary coding outperforms binary
coding when distance estimation is used to reduce the search computation
cost. However, in combination with Subset Indexing, interestingly, binary
coding works better for retrieval.